Summary

Background The emergence and re-emergence of infectious diseases presents a significant challenge to public 
health and broader society. This study utilises novel nationwide data to calculate the transmission risk and potential 
inequity of infectious disease outbreaks through use of network analysis. 


Methods Nationwide employment and education microdata (~4.7 million individuals in Aotearoa New Zealand)
were used to develop the Aotearoa Co-incidence Network (ACN). The ACN considers connections generated when
individuals are employed at the same workplaces or enrolled at the same schools. Through network analysis,
connections between geospatial areas can be established and provide proxy measures of infectious disease
transmission risk. The ACN was also overlaid with nationwide population vulnerability data based on the number of
older adults (>65 years) and individuals with long-term health conditions.


Findings We identify areas that have both high potential transmission risk (i.e., highly connected) and high vulnera- 
bility to infectious diseases. Community detection identified geographic boundaries that can be relevant to the appli- 
cation of regional restrictions for limiting infectious disease transmission. 


Interpretation Integrating novel network science and geospatial analytics provides a simple way to study infectious
disease transmission risk and population vulnerability to outbreaks. Our replicable method has utility for
researchers globally with access to such data. It can help inform equitable preparation for, and responses to,
infectious disease outbreaks.


Funding This project was funded by the Health Research Council of New Zealand (20/1442) and by the NZ
Government via the Ministry of Business, Innovation and Employment and the Department of the Prime Minister and Cabinet.
Background

Infectious diseases are emerging and re-emerging at a
rate not seen before. Considered alongside declining
vaccination coverage, increasing antimicrobial resistance
and inequity, these present some of the greatest
challenges of the 21st century. The current COVID-19
pandemic has restricted usual activities such as
immunisation programmes and routine hospital care in
many countries across the globe.
Consequently, a better understanding of potential infec- 
tious disease transmission will help future policy and 
research respond to new and re-emerging infectious dis- 
eases. With the increased collection of large complex 
data sets, such as those collected for administrative pur- 
poses, there is increased opportunity to understand 
potential transmission pathways prior to the develop- 
ment of an outbreak. In order to realise this opportu- 
nity, efforts are needed to provide generalisable 
methods that can make use of these complex forms of 
data. The current study seeks to provide one such 
Research in Context


Evidence before this study 


Network analysis has a strong history in epidemiology,
as outlined by Danon et al. It has increasingly been
used to quantify transmission risk for infectious diseases,
based on analysis of contact or interaction networks.
Network analysis can be combined with empirical data
sources to help infer transmission contexts and patterns
of transmission between geospatial areas. Zhang et al.
and Irini et al. used geo-located mobile phone data to
infer mobility, while Liu et al. used existing COVID-19
case data from Hong Kong. Irini et al. found that
workplaces served as important contexts for infectious
disease transmission in Ireland, while Munday et al.
conducted network analysis on census data to show how
schools serve as important contexts for infectious
disease transmission in the UK.


Added value of this study 


We utilised an internationally unique source of individ- 
ual-level microdata, the Integrated Data Infrastructure, to 
create a publicly available tool named the Aotearoa Co- 
incidence Network (ACN). The ACN is a unique network 
containing over 600,000 connections between geospatial 
areas in Aotearoa New Zealand (NZ), generated through 
the combination of census, workplace, and education 
data, for ~4.7 million individuals. The connections in the 
ACN represent the number of potential shared inciden- 
ces between individuals in workplaces and schools. We 
use the ACN to estimate the transmission risk pathways 
due to interactions in the contexts of work and educa- 
tion. We employ a novel network analysis approach to 
identify important, highly connected geospatial areas 
across NZ. More specifically, we use network centrality as 
a proxy measure of transmission risk. This is based on 
both the strength and the structure of connections 
between different areas. We apply community detection 
to reveal spatially contiguous regional clusters within the 
ACN. Such regions can be used to guide the implementa- 
tion of non-pharmaceutical interventions such as “lock- 
downs”. Building on the growing body of research using 
network analysis to study infectious diseases, the ACN is 
a rare example of how networks can be used to repre- 
sent potential transmission pathways across a whole 
country. The ACN is a powerful tool as it can be used to 
inform public health responses to emerging pandemics, 
without the need for detailed case data or additional 
forms of data collection. The analytical methods pre- 
sented in this study are easily replicable and will have 
value to researchers across international contexts. Our 
study also demonstrates how the ACN can be combined 
with existing nationwide data on vulnerability in terms of 
existing long-term health conditions to explore the 
equity of infectious disease outbreaks. 


Implication of all the available evidence 


We identify several geospatial areas that carry a high
risk of infectious disease transmission based on the
number of connections that individuals share through
workplaces and schools. The ACN also revealed broader 
regional clusters and communities that tend to be more 
densely connected. Community detection outlined a 
distinct set of regional geographic boundaries that can 
inform the application of any regionally specific strate- 
gies to mitigate transmission spread such as mandated 
use of masks, workplace/school closures, or increased 
testing should any outbreak of infectious disease occur. 
By integrating the ACN with nationwide geospatial data 
on population vulnerability, the current study also iden- 
tifies areas at high risk of transmission that are also 
highly vulnerable to an outbreak. Findings from the
ACN were provided to the NZ government to assist in
the response to the 2020 and 2021 outbreaks of COVID-19.
The current study will be of interest to researchers,
policymakers and communities across the world, in
particular in the Western Pacific, and across
disciplines. All data, code, and an associated web
application
(https://stur600.shinyapps.io/aotearoa-coincidence-network/)
are freely available and can be used as a tool to explore
transmission risk and vulnerability at a fine-grained level.











method of understanding potential transmission risk, 
by employing a novel form of network analysis to 
explore how regions are connected through shared 
interaction contexts. 

Network analysis is a useful tool for understanding
the transmission of infectious diseases, with a strong
history in the field of epidemiology. Network structures
consist of collections of nodes and links, where nodes
can represent individuals or entities, and links represent
the relationships between them. Networks can be complex:
they may contain two or more node types (multipartite),
or multiple types of links (multigraphs). As an
example, nodes representing individuals may connect
to nodes representing workplaces. In such a network,
workers would share indirect connections to their
colleagues through a common workplace node. Network
analysis has been widely used in the study of infectious
diseases, including sexually transmitted diseases,
livestock diseases and, more recently, COVID-19.
Transmission naturally takes on the structure of an
interaction network, where connections shared between
individuals may be used to represent links in a potential
chain of transmission. Such networks can be used to
predict the distribution of infections following an
outbreak.

Despite the value that can be gained from studying
infectious disease transmission through network
analysis, doing so can be challenging. Social network
analysis is often employed to investigate infectious
diseases spread through contacts; however, this often
requires considerable resources and time-consuming
surveys or expensive equipment such as wireless sensor
technology. Emerging sources of large and complex
data, often collected for administrative purposes,
have the potential to mitigate the challenges of these
forms of data collection by inferring interaction
contexts. Such data can be used to address pressing global
health challenges such as infectious disease outbreaks
without the need for additional data collection.
Moreover, the rapid digitalisation of health, education and
broader social systems, together with significant
improvements in data processing and storage capabilities,
has allowed researchers to access several sources of
novel data not previously available, such as linked
nationwide data on households, education, and employment.

Efforts continue to be made to model infectious disease
transmission using complex systems approaches.
With advances in computing power, the utility of networks
for analysing numerous sources of data has
increased dramatically, and studies employing network
analysis to understand disease transmission patterns
are increasingly popular. Liu and colleagues
demonstrated how networks can help uncover
spatiotemporal transmission patterns, using existing
data collected on COVID-19 cases across different
districts in Hong Kong. Importantly, they argue that
understanding connectivity across geospatial areas can
help guide public health responses to future pandemics.
Munday and colleagues used census data from the UK
to create a network of schools connected to households,
and through this approach were able to assess the
potential impact of school closures on COVID-19
transmission. Fewer studies have combined transmission
risk data with an understanding of population
vulnerability within particular geographical areas. Doing
so adds important contextual information about which
areas may be most affected and informs an equitable
public health response. For instance, evidence suggests
that certain individuals, such as those with medical
conditions, may have a higher risk of severe illness from
COVID-19.

Our study contributes a novel methodological
approach combining different sources of nationwide
data to develop the Aotearoa Co-incidence Network
(ACN). The ACN is a network of geographic areas that
may be used as a proxy measure of potential infectious
disease transmission. The ACN considers the connections
shared between different geospatial areas through
shared workplaces and schools. Both contexts serve as
“hubs” which connect many individuals and have been
identified as key contexts where disease transmission
takes place. Previous studies, such as that conducted
by Munday and colleagues, have used networks
where schools are connected to households to study
how schools may affect infectious disease transmission.
While these forms of analysis are valuable, the current
study provides an alternative method that can
contribute insights into potential transmission pathways
without the need to consider the location of
schools or workplaces. In addition to providing the
means to study potential transmission pathways in a
way that preserves the anonymity of schools and
workplaces, a simplified network structure focusing on the
geographic areas that people inhabit can provide
additional important insights into potential
infectious disease transmission. The following section
outlines how the ACN is constructed, how it can be
used to derive proxy measures of geospatial transmission
risk, and how it may be combined with other sources
of data, such as health vulnerability, to assess the
equity of potential outbreaks.


Method 


Study design 

This was a nationwide, cross-sectional and geospatial
study in Aotearoa New Zealand (NZ) approved by
Statistics NZ (reference: MAA2020-36). All code and data
used in the current study are publicly available and can be
accessed at:
https://gitlab.com/tpm-public-projects/aotearoa-connection-network.
Network analysis was carried out in R using the igraph package.


Data source: Integrated Data Infrastructure (IDI) 
Nationwide data were obtained from the Integrated
Data Infrastructure (IDI) for ~4.7 million individuals
present in NZ census 2018 records, along with their
household (dwelling of usual residence in census 2018),
school enrolments (current enrolment data from the
Ministry of Education) and place of employment (wages and
salaries data from Inland Revenue). The IDI is a unique
collection of individual-level linked microdata for people
in NZ. All data are linked and completely de-identified
before being made available to researchers within a
secure data lab. The IDI environment is only accessible
to researchers and projects approved by Statistics
NZ. All extracted data are checked by Statistics NZ
before release to ensure non-identifiability of
individuals or entities covered by the aggregated data; this
includes suppression of low counts and random rounding
of sufficiently high counts.


Developing the Aotearoa Co-incidence Network (ACN) 
The Aotearoa Co-incidence Network (ACN) is constructed
by generating a network consisting of two
sets of nodes (a bipartite network), where dwellings
are linked to those schools/workplaces where inhabitants
of the dwelling are enrolled/employed (Figure 1:
Panel 1A). We then “project” onto the dwelling nodes
in the network to obtain a network consisting of only
dwellings, with links connecting dwellings when
inhabitants share a workplace or school (Figure 1:
Panel 1B). In addition to making the network simpler,
the projection focuses the analysis on areas
corresponding to individuals’ place of usual residence,
making it possible to directly compare with data on
regional vulnerabilities. The projected network is then
aggregated into connections between the geospatial
area units that contain the dwellings, defined by
Statistics NZ as Statistical Area 2 (SA2), to produce the
ACN (Figure 1: Panel 1C). Importantly, the connections
shared between SA2s in the ACN can be considered
potential pathways by which infectious diseases
may be transmitted.

Figure 1. A simplified depiction of the development of the Aotearoa Co-incidence Network (ACN) for the case of five people from
five different dwellings attending the same school. Panel 1A represents how dwellings in one area can share connections with
dwellings in different areas through a shared interaction context (e.g., school). Panel 1B shows how the network can be simplified
by focusing on the indirect connections shared by dwellings through that interaction context. This is simplified further by
aggregating across geospatial areas (Panel 1C).


SA2s are part of the Statistical Standard for Geographic
Areas 2018 (SSGA2018) and can contain
up to around 4,000 residents (rural SA2s may contain
fewer than 1,000 residents). We exclude SA2s located
outside of NZ Territorial Authorities (TAs), leaving
2,147 SA2s in the final network, connected by 669,878
links. Each link has an associated value corresponding
to the total number of combinations of inhabitants
from the pair of SA2s who are co-employed or
co-enrolled at a workplace or school.
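
The link weights described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function name `area_network`, the input layout and the toy identifiers are assumptions made for illustration, and the sketch collapses the dwelling-level projection and the SA2 aggregation into one step by counting, for each workplace or school, the pairs of attendees who live in different areas. Pairs living in the same area are ignored here, which is also an assumption.

```python
from collections import Counter, defaultdict
from itertools import combinations


def area_network(attendance, home_area):
    """Build weighted links between areas from shared contexts.

    attendance: iterable of (person_id, context_id) pairs, where a context
                is a workplace or a school (hypothetical input layout).
    home_area:  dict mapping person_id to the area code of their dwelling.
    Returns a dict {(area_a, area_b): weight} with area_a < area_b.
    """
    # Count, for every context, how many attendees live in each area.
    per_context = defaultdict(Counter)
    for person, context in attendance:
        per_context[context][home_area[person]] += 1

    # Each pair of areas sharing a context gains one unit of weight per
    # combination of inhabitants (n_a * n_b) who meet in that context.
    weights = Counter()
    for counts in per_context.values():
        for a, b in combinations(sorted(counts), 2):
            weights[(a, b)] += counts[a] * counts[b]
    return dict(weights)


# Toy data: three pupils at one school, two colleagues at one workplace.
attendance = [("p1", "school"), ("p2", "school"), ("p3", "school"),
              ("p3", "work"), ("p4", "work")]
home_area = {"p1": "A", "p2": "A", "p3": "B", "p4": "C"}
print(area_network(attendance, home_area))
# {('A', 'B'): 2, ('B', 'C'): 1}
```

Counting residents per area within each context avoids materialising the much larger dwelling-to-dwelling network, while giving the same area-level totals under the assumptions stated above.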
Identifying connected communities

We analyse the patterns of connections in the ACN
using community detection, which serves to partition
the network into different communities of SA2s.
Communities represent clusters of SA2s that tend to be
more strongly connected to areas within the same
community than they are to SA2s in other communities.
Partitioning the ACN into these communities is a useful
technique for coarse-graining the network to a
manageable size and providing an overview of how
different areas are connected. We use a method of
community detection that employs modularity maximisation
to identify the most robust communities. In simple
terms, modularity maximisation identifies communities
that result in the highest ratio of the number of links
within groups, relative to those between groups. This
method can be applied to the ACN based on all types of
regional connections, or for different types of workplace
and school connections separately. This is valuable,
since the importance of different types of connection
can vary depending on context (e.g., schools may be
closed, only essential workplaces open). We tested
various modularity maximisation community detection
methods, which resulted in similar communities and
modularity scores.
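
To make the quantity being maximised concrete, the sketch below computes the standard weighted modularity score of a given partition in Python. It only evaluates a partition and does not search for the best one, and it is not a statement of which algorithm the study used. The function name `modularity` and the toy network are hypothetical. Strictly, modularity compares the link weight inside each community with the weight expected if links were placed at random while preserving each node's total strength.

```python
from collections import Counter


def modularity(edges, community):
    """Weighted modularity Q of a partition of an undirected network.

    edges:     dict {(u, v): weight}, one entry per undirected link.
    community: dict mapping each node to a community label.
    """
    m = sum(edges.values())   # total link weight in the network
    internal = Counter()      # link weight falling inside each community
    strength = Counter()      # summed node strength of each community
    for (u, v), w in edges.items():
        strength[community[u]] += w
        strength[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    # Observed within-community share minus the share expected at random.
    return sum(internal[c] / m - (strength[c] / (2 * m)) ** 2
               for c in strength)


# Toy network: two triangles joined by a single bridging link.
edges = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
         ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1,
         ("c", "d"): 1}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}
print(modularity(edges, split) > modularity(edges, merged))  # True
```

A modularity maximisation method searches over partitions for the one with the highest such score; here the two-triangle split scores 5/14 while placing all nodes in one community scores 0.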


Transmission risk and vulnerability 

We calculate a proxy measure of transmission risk in
the ACN using PageRank centrality. Centrality measures
provide a valuable metric for identifying
“important” nodes in a network based on the number
and patterns of links shared between different nodes.
There are a number of different centrality metrics,
each containing different underlying assumptions
about what “importance” means. The simplest centrality
measure, degree centrality, simply counts the number
of connections (the degree) of each node; betweenness
centrality counts the number of shortest paths on a
network, between arbitrary pairs of nodes, that pass
through a specific node. In this study, we use
PageRank, a flavour of eigenvector centrality.

PageRank centrality is an appropriate centrality measure
for infection on networks as it captures the concept
of node “importance” in the sense of the probability of a
random spreading process on a network visiting a
specific node. PageRank centrality has wide usage in
studies of infectious disease transmission, including studies
investigating the importance of a geographic location in
human flow networks as well as research seeking to
identify facilities most at risk from Bovine Viral
Diarrhea Virus. For the current study, PageRank centrality
is implemented using the igraph package in R by
treating the links between SA2s as bidirectional (i.e., we
assume transmission could originate from either of a
pair of SA2s). PageRank then provides a proxy for
transmission risk by considering both the structure and
strength of connections between SA2s: not only the
connections from one SA2 to its neighbours, but also
the connections from those neighbours to next-nearest
neighbour SA2s, and so on. An SA2 will be deemed
high risk if the summed risk of its neighbours is high.
This covers both the case when an SA2 has many
neighbours and when an SA2 has a few high-risk neighbours.
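
The idea of risk being passed between neighbouring areas can be sketched as a weighted PageRank computed by power iteration. This Python sketch is illustrative only and is not the igraph routine used in the study: the function name `pagerank`, the damping factor of 0.85 (a commonly used default) and the convergence tolerance are assumptions, since the text does not state the settings used.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, tol=1e-10, max_iter=1000):
    """Weighted PageRank on an undirected network via power iteration.

    edges: dict {(u, v): weight}; every link is treated as bidirectional.
    damping and tol are assumed values, not taken from the study.
    """
    # Symmetric weighted adjacency: spread can travel either way.
    nbrs = defaultdict(dict)
    for (u, v), w in edges.items():
        nbrs[u][v] = nbrs[u].get(v, 0) + w
        nbrs[v][u] = nbrs[v].get(u, 0) + w
    nodes = sorted(nbrs)
    n = len(nodes)
    strength = {u: sum(nbrs[u].values()) for u in nodes}

    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Every node receives a small uniform share of the total score ...
        new = {u: (1 - damping) / n for u in nodes}
        # ... plus a share of each neighbour's score, split in proportion
        # to the weight of the link relative to that neighbour's strength.
        for u in nodes:
            for v, w in nbrs[u].items():
                new[v] += damping * rank[u] * w / strength[u]
        change = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if change < tol:
            break
    return rank


# Toy network: one hub area linked to three peripheral areas.
scores = pagerank({("hub", "a"): 1, ("hub", "b"): 1, ("hub", "c"): 1})
print(max(scores, key=scores.get))  # hub
```

Because each node's score is built from its neighbours' scores, an area can rank highly either by having many links or by being linked to a few highly ranked areas, which mirrors the two cases described above.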

We overlaid the spatial estimates of transmission
risk, derived from the ACN, with data detailing the
vulnerability of each geospatial area at SA2 level.
Vulnerability data are taken from Wiki and colleagues and are
based on the proportion of the population with
long-term health conditions (LTCs). LTCs were sourced from
the National Minimum Dataset for the period 2011-2016.
LTCs included cancer, cardiovascular conditions,
diabetes, renal conditions, and respiratory illnesses.
Due to the strong association between COVID-19
vulnerability and age, each of these factors is combined
with the percentage of individuals over 65 years
to generate a composite score of vulnerability for each
SA2.

To explore and visualise SA2s with high potential
transmission risk and population vulnerability, we
created a bivariate legend (Figure 2) which cross-referenced
these sets of data, with each dimension split into
tertiles. For instance, category ‘A’ would be
low on both potential transmission risk and health
vulnerability, while ‘F’ would be high potential
transmission risk but low health vulnerability, and ‘D’ would be
high in terms of health vulnerability but low potential
transmission risk. Perhaps the most important category
is ‘I’, which indicates areas that rank in the highest
tertile for both potential transmission risk and health
vulnerability. To aid exploration of vulnerability, we
also report distributions at the level of TA areas, a
much larger geospatial unit which represents the second
tier of local government below Regional Council
areas.
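
The cross-referencing of tertiles can be sketched in a few lines of Python. This is an illustrative reading of the procedure rather than the authors' implementation: the function names, the rank-based tertile split and the handling of ties are assumptions, and only the four category letters named in the text (A, F, D and I) are mapped, since the lettering of the remaining five cells is not given here.

```python
def tertile_index(values):
    """Map each key to 0 (lowest third), 1 or 2 (highest third) by rank.

    Ties are broken by sort order, which is a simplification.
    """
    ordered = sorted(values, key=values.get)
    n = len(ordered)
    return {key: min(2, (3 * i) // n) for i, key in enumerate(ordered)}


# Keys are (transmission tertile, vulnerability tertile). Only the four
# categories named in the text are labelled; other cells stay unnamed.
NAMED = {(0, 0): "A", (2, 0): "F", (0, 2): "D", (2, 2): "I"}


def bivariate_classes(risk, vulnerability):
    """Return {area: (risk tertile, vulnerability tertile, letter or None)}."""
    r = tertile_index(risk)
    v = tertile_index(vulnerability)
    return {area: (r[area], v[area], NAMED.get((r[area], v[area])))
            for area in risk}


# Toy scores for three hypothetical areas.
risk = {"s1": 0.1, "s2": 0.5, "s3": 0.9}
vulnerability = {"s1": 0.2, "s2": 0.5, "s3": 0.9}
print(bivariate_classes(risk, vulnerability))
# {'s1': (0, 0, 'A'), 's2': (1, 1, None), 's3': (2, 2, 'I')}
```

Returning the pair of tertile indices alongside the letter keeps the classification usable for the unnamed middle categories as well.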


Role of Funding Source 

This project was funded by the Health Research Council
of New Zealand (20/1442) and by the New Zealand
Government via Ministry of Business, Innovation and
Employment and Department of the Prime Minister and
Cabinet contracts for modelling advice on responding to
COVID-19 in Aotearoa New Zealand. The funding
source had no role in study design, data collection, data
analysis, interpretation, or writing of this report.


Results 


Aotearoa Co-incidence Network 

Figure 3 presents the connections shared between
central SA2s across NZ cities from the ACN. We observe
that most connections are made to SA2s near the
selected SA2, with fewer connections as distance
increases. However, the ACN also reveals more detailed
patterns. For example, Wellington Central has a
significant number of connections to the north of the South
Island. Detailed findings are available in a public-facing,
interactive web application (see
https://stur600.shinyapps.io/aotearoa-coincidence-network/).

Figure 2. Bivariate legend of SA2s based on potential transmission risk and health vulnerability (axes: higher vulnerability; higher
transmission risk). Note: grey (category A) represents areas with low potential transmission risk and low vulnerability; blue
(category F) represents areas with high potential transmission risk and low vulnerability; red (category D) represents areas with low
potential transmission risk and high vulnerability; purple (category I) represents areas with high potential transmission risk and
high vulnerability.


Connected communities 

Through use of community detection, we also identified 
the geographic communities that are present in the 
ACN. We present the findings for all types (both work 
and education) of connections in the ACN. Geographic 
communities are indicated by the different colours in 
Figure 4A. The communities identified strongly reflect 
established local government (TA) boundaries with 
some exceptions. For example, the Auckland community
extends further south than the Auckland TA boundary
(see Figure 4B). Communities based solely on school
connections and solely on workplace connections are
presented in supplementary material (Figure S1).


Transmission risk 

Using the connections in the ACN we calculate the
PageRank centrality of each SA2 to identify areas that can
be considered most at risk in terms of potential
infectious disease transmission. We tested various measures
of centrality (supplementary material, Figure S2), but
present PageRank since it captures both strength of
connections and the effect of relative risk from
neighbouring connected areas. Figure 5A shows the distribution
of PageRank scores across NZ at the level of SA2. Areas
with increased potential transmission risk (i.e., higher
PageRank centrality) tend to be found in clusters. This
is reflected in the observation that PageRank establishes
a clear partition between urban and rural areas
(supplementary material, Figure S3) such that urban areas are
more likely to have higher PageRank scores and thus
higher potential transmission risk. We find that
distributions vary across different regions (Figure 5B), with
some appearing normal (e.g., Auckland), and others
appearing multimodal (e.g., Hamilton). This highlights
the diversity in regional transmission risk and the
importance of understanding transmission risk at
finer-grained geospatial levels.

Figure 3. The number of connections from central SA2s in Auckland, Hamilton, Wellington, Christchurch, and Dunedin. Colour
gradient represents the number of connections from the specified SA2 to other SA2s.


Combining transmission risk and population 
vulnerability 

Figure 6A combines potential transmission risk
(PageRank centrality) with estimates of health vulnerability
for each SA2 across NZ. Significant areas of the North
and South Island fell in the lowest tertile
(grey shaded areas, category A) for both potential
transmission risk and health vulnerability. Despite this,
several areas of the North and South Island were classified
as high health vulnerability but low potential
transmission risk (red shaded areas, category D). Large parts of
Wellington and Hamilton were classified as high
potential transmission risk but low health vulnerability (blue
shaded areas, category F). High potential transmission
risk and high health vulnerability were identified in
several areas of the North and South Island, including in
Christchurch, Hamilton, Dunedin and Auckland.
Dunedin tends to have a higher proportion of SA2s that are
potentially high risk and high vulnerability, while
Wellington is mainly composed of SA2s that have high
potential transmission risk but low vulnerability
(Figure 6B).


Discussion 

Understanding the connections between people and
places is important for modelling potential infectious
disease transmission. The current study used a
unique and replicable methodology based in network
science to estimate potential transmission risk for
communicable disease across geospatial regions. We
combined several sources of nationwide data in NZ on
dwellings, schools, and workplaces to produce the ACN.
The ACN is highly informative for understanding
potential infectious disease transmission pathways and
for calculating proxy measures of infectious disease
transmission risk.

Our study presented a novel method for determining
geographical boundaries, generated through community
detection on the ACN. The utility of the ACN therefore
extends to informing the application of
non-pharmaceutical interventions, which help to curb the
spread of infectious disease and have been widely
employed in the Western Pacific and around the world. This
includes approaches such as defining regional
boundaries or mandating use of masks, workplace/
school closures or increased testing. Such findings will
inform the allocation of resources across NZ, identify
communities that are at high risk of transmission, and
help the NZ government make decisions to contain the
spread of community outbreaks of COVID-19.
Prior to the development of the ACN, during the
Auckland regional lockdown in February 2020, police
checkpoints were extended further south (Figure S3B) to
allow for the number of commuters that needed to cross
the Auckland TA boundary, even with reduced numbers
of people working on-site during the lockdown period.
The ‘Auckland community’ detected in the ACN (Figure
S4A) provides a better fit to this adapted boundary
compared to the original TA boundary, which may validate
the ACN’s capability and impact as a planning tool.

Figure 4. Communities detected in the ACN for the whole of New Zealand (Panel 4A), and for the Auckland region specifically (Panel
4B). Territorial Authority boundaries are outlined in black.


Our study shows that centrality, or the extent of
connectedness between areas, is higher in urban areas. We
confirm findings observed in other studies that a higher
density of inhabitants and potential close contact
between people, for instance in urban areas, may
provide ideal conditions for the rapid spread of infectious
diseases. Despite this, other evidence shows that risk
of transmission often varies across continents, across a
country, and even within a city. While sufficiently
aggregated to preserve privacy, the ACN reveals at a fine
geographical scale where variations in potential
transmission risk may exist and shows that areas with high
levels of centrality tend to be clustered together
geospatially. This observation reflects Tobler’s first law of
geography: “everything is related to everything else,
but near things are more related than distant things.”
Figure 5. PageRank centrality scores across Aotearoa New Zealand (Panel 5A) and associated distributions across SA2s located in
the major Territorial Authorities of Auckland, Christchurch, Hamilton, Dunedin, and Wellington (Panel 5B). The colour gradient across
both Panel 5A and 5B represents PageRank centrality, where higher PageRank provides an indication of higher potential transmission
risk.


In addition to identifying areas with high potential
transmission risk, it is also important to account for
vulnerable populations residing within such areas. Some
areas may be more vulnerable to infectious disease, in
terms of age, health conditions and access or lack of
access to healthcare services. Our study highlights
several geographic areas at risk of infectious disease
transmission containing a high proportion of the
population with long-term health conditions. Previous research
highlights which areas may be vulnerable, where
super-spreading events may have occurred, or areas that may
have inequitable access to vaccination centres, but
seldom have such data been combined with a proxy for
transmission risk as provided by the ACN.

Figure 6. Combined potential transmission risk and population vulnerability based on long-term health conditions (Panel 6A) and
distribution of risk for each SA2 located in the major territorial authorities of Dunedin, Christchurch, Hamilton, Auckland and
Wellington (Panel 6B). Colours indicate levels of potential transmission risk and health vulnerability, as represented in Figure 2.

There are several policy implications from the ACN.
Considering that infectious diseases are emerging and
re-emerging at a rate not seen before, our methodology,
based on novel network and geospatial data science,
provides a simple yet effective way to estimate potential
transmission risk for any infectious disease outbreak.
Importantly, it is replicable and has application for any
researcher with access to sources of linked data. An
enhanced understanding of how different areas may be
connected, and of the vulnerability of the population within
those areas, is important for governments and
communities in mitigating the spread of infectious diseases.
The utility of the ACN was further evidenced in its use
as a resource for informing the NZ government’s
response to COVID-19, with technical reports provided
to the government during outbreaks of COVID-19 in
2021.49

A strength of the ACN is its whole population
approach, using relatively comprehensive nationwide
data from the IDI (n~4.7 million), but limitations
within the IDI data do exist. The New Zealand census
population is limited in its representation of the
current-day NZ population, being smaller than 2021
estimates. However, the current study can be easily
reproduced with future census data, and by other researchers
around the world with similar data. Another limitation
is that there is a slight temporal mismatch between the
census data used to build the ACN (2018) and the data on
vulnerability (2011-2016), while the proportion of
individuals with long-term health conditions within the
vulnerability data is also just one aspect of vulnerability.
Future work could extend the ACN to look at more
detailed information regarding the types of workplaces
and schools where individuals share an interaction
context and consider more information regarding
population vulnerabilities. For example, healthcare and
hospitality workplaces may involve riskier interactions
than the information technology sector, while
secondary schools may serve as larger hubs in a
network than primary schools.


Conclusion 

At a time when infectious diseases are continuing to
emerge, efforts to make increased use of existing
sources of data are especially valuable. We outline a novel,
reproducible methodology where we represent shared
interaction contexts through workplaces and schools as
the ACN. Our results at a broad level show that the areas
of most risk within the ACN are urban areas, and areas
of high risk are often clustered in close proximity to
one another. However, the ACN provides the granularity
to explore risk and vulnerability at a detailed level
that can help inform regional responses to outbreaks of
infectious disease. The ACN is a powerful tool for
informing responses to outbreaks of infectious disease,
and its reproducible methodology makes it relevant for
researchers across the world.


Disclaimer 

Access to the data used in this study was provided by 
Stats NZ under conditions designed to give effect to the 
security and confidentiality provisions of the Statistics 
Act 1975. The results presented in this study are the 
work of the author, not Stats NZ or individual data sup- 
pliers. The results are based in part on tax data supplied 
by Inland Revenue to Stats NZ under the Tax Adminis- 
tration Act 1994 for statistical purposes. Any discussion 
of data limitations or weaknesses is in the context of
using the IDI for statistical purposes, and is not related 
to the data’s ability to support Inland Revenue’s core 
operational requirements. 